8 research outputs found

    Visualizing the Diversity of Representations Learned by Bayesian Neural Networks

    Explainable Artificial Intelligence (XAI) aims to make learning machines less opaque, and offers researchers and practitioners various tools to reveal the decision-making strategies of neural networks. In this work, we investigate how XAI methods can be used for exploring and visualizing the diversity of feature representations learned by Bayesian Neural Networks (BNNs). Our goal is to provide a global understanding of BNNs by making their decision-making strategies a) visible and tangible through feature visualizations and b) quantitatively measurable with a distance measure learned by contrastive learning. Our work provides new insights into the posterior distribution in terms of human-understandable feature information with regard to the underlying decision-making strategies. The main findings of our work are the following: 1) global XAI methods can be applied to explain the diversity of decision-making strategies of BNN instances, 2) Monte Carlo dropout with commonly used dropout rates exhibits increased diversity in feature representations compared to the multimodal posterior approximation of MultiSWAG, 3) the diversity of learned feature representations correlates highly with the uncertainty estimate for the output, and 4) the inter-mode diversity of the multimodal posterior decreases as the network width increases, while the intra-mode diversity increases. These findings are consistent with recent theory of deep neural networks, providing additional intuitions about what the theory implies in terms of human-understandable concepts.
    Comment: 16 pages, 18 figures
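    The diversity measurement described above can be illustrated with a minimal sketch: given feature vectors from several stochastic forward passes (e.g. Monte Carlo dropout samples), summarize their spread as the mean pairwise distance. The feature data here is a random stand-in, and plain cosine distance replaces the contrastively learned distance measure the paper uses.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stand-in: T stochastic forward passes (e.g. Monte Carlo
    # dropout) each yield a D-dimensional feature representation. The paper
    # learns its distance measure with contrastive learning; cosine distance
    # here merely illustrates the diversity summary.
    T, D = 8, 16
    features = rng.normal(size=(T, D))

    def cosine_distance_matrix(F):
        """Pairwise cosine distances between the row vectors of F."""
        F = F / np.linalg.norm(F, axis=1, keepdims=True)
        return 1.0 - F @ F.T

    def representation_diversity(F):
        """Mean off-diagonal pairwise distance: a scalar diversity summary."""
        dmat = cosine_distance_matrix(F)
        mask = ~np.eye(dmat.shape[0], dtype=bool)
        return float(dmat[mask].mean())

    diversity = representation_diversity(features)
    ```

    A higher value of `diversity` would indicate that the sampled BNN instances rely on more dissimilar feature representations, which the paper relates to output uncertainty.
    
    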

    FedZero: Leveraging Renewable Excess Energy in Federated Learning

    Federated Learning (FL) is an emerging machine learning technique that enables distributed model training across data silos or edge devices without data sharing. Yet, FL inevitably introduces inefficiencies compared to centralized model training, which will further increase the already high energy usage and associated carbon emissions of machine learning in the future. Although the scheduling of workloads based on the availability of low-carbon energy has received considerable attention in recent years, it has not yet been investigated in the context of FL. However, FL is a highly promising use case for carbon-aware computing, as training jobs consist of energy-intensive batch processes scheduled in geo-distributed environments. We propose FedZero, an FL system that operates exclusively on renewable excess energy and spare capacity of compute infrastructure to effectively reduce the training's operational carbon emissions to zero. Based on energy and load forecasts, FedZero leverages the spatio-temporal availability of excess energy by cherry-picking clients for fast convergence and fair participation. Our evaluation, based on real solar and load traces, shows that FedZero converges considerably faster under the mentioned constraints than state-of-the-art approaches, is highly scalable, and is robust against forecasting errors.
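    The client-selection idea can be sketched in a few lines. This is an illustrative toy, not FedZero's actual scheduling algorithm: clients whose forecast excess energy covers one training round are eligible, and among those, clients that have participated least often are preferred (fair participation), breaking ties toward larger forecasts. All names and numbers are hypothetical.

    ```python
    # Illustrative sketch (not FedZero's actual algorithm): pick k eligible FL
    # clients, preferring those with fewer past rounds, then larger forecasts.
    def select_clients(forecast_kwh, rounds_participated, demand_kwh, k):
        eligible = [c for c, e in forecast_kwh.items() if e >= demand_kwh]
        eligible.sort(key=lambda c: (rounds_participated.get(c, 0),
                                     -forecast_kwh[c]))
        return eligible[:k]

    # Hypothetical per-client excess-energy forecasts (kWh) for the next round
    forecast = {"a": 0.9, "b": 0.1, "c": 1.5, "d": 0.8}
    history = {"a": 3, "c": 0, "d": 1}   # rounds each client has trained so far

    chosen = select_clients(forecast, history, demand_kwh=0.5, k=2)  # ["c", "d"]
    ```

    Client "b" is excluded because its forecast cannot cover the round's demand; "c" and "d" win over "a" because they have participated less often.
    
    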

    Coordinated optimization of visual cortical maps (II) Numerical studies

    It is an attractive hypothesis that the spatial structure of visual cortical architecture can be explained by the coordinated optimization of multiple visual cortical maps representing orientation preference (OP), ocular dominance (OD), spatial frequency, or direction preference. In part (I) of this study we defined a class of analytically tractable coordinated optimization models and solved representative examples in which a spatially complex organization of the orientation preference map is induced by inter-map interactions. We found that attractor solutions near symmetry-breaking threshold predict a highly ordered map layout and require a substantial OD bias for OP pinwheel stabilization. Here we examine in numerical simulations whether such models exhibit biologically more realistic, spatially irregular solutions at a finite distance from threshold and when transients towards attractor states are considered. We also examine whether model behavior qualitatively changes when the spatial periodicities of the two maps are detuned and when considering more than two feature dimensions. Our numerical results support the view that neither minimal-energy states nor intermediate transient states of our coordinated optimization models successfully explain the spatially irregular architecture of the visual cortex. We discuss several alternative scenarios and additional factors that may improve the agreement between model solutions and biological observations.
    Comment: 55 pages, 11 figures. arXiv admin note: substantial text overlap with arXiv:1102.335
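    The flavor of such coordinated optimization can be conveyed with a toy gradient descent on a hypothetical coupled energy (not the paper's actual functional): the OP map is a complex field z whose selectivity |z| is pushed toward 1, the OD map a real field o, and a coupling term |z|² o² discourages strong selectivity and strong ocular dominance at the same site.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Toy coupled-map energy, loosely inspired by the model class; the exact
    # functional, scales, and dynamics here are all illustrative assumptions.
    N, steps, lr = 16, 400, 0.01
    z = 0.5 * (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N)))  # OP map
    o = 0.5 * rng.normal(size=(N, N))                                   # OD map

    def energy(z, o):
        # selectivity term pushes |z| toward 1; coupling penalizes overlap
        return float(np.mean((np.abs(z) ** 2 - 1) ** 2
                             + np.abs(z) ** 2 * o ** 2))

    e_start = energy(z, o)
    for _ in range(steps):
        gz = 2 * z * (np.abs(z) ** 2 - 1) + z * o ** 2   # dE/dz* per site
        go = 2 * o * np.abs(z) ** 2                      # dE/do   per site
        z = z - lr * gz
        o = o - lr * go
    e_end = energy(z, o)
    ```

    Descending this energy drives the two maps into a coordinated layout; the paper's question is whether the resulting minima and transients look as spatially irregular as real cortical maps.
    
    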

    DORA: Exploring outlier representations in Deep Neural Networks

    Deep Neural Networks (DNNs) draw their power from the representations they learn. In recent years, however, researchers have found that DNNs, while being incredibly effective in learning complex abstractions, also tend to be infected with artifacts, such as biases, Clever Hanses (CH), or backdoors, due to spurious correlations inherent in the training data. So far, existing methods for uncovering such artifactual and malicious behavior in trained models focus on finding artifacts in the input data, which requires both the availability of a dataset and human intervention. In this paper, we introduce DORA (Data-agnOstic Representation Analysis): the first automatic data-agnostic method for the detection of potentially infected representations in Deep Neural Networks. We further show that contaminated representations found by DORA can be used to detect infected samples in any given dataset. We qualitatively and quantitatively evaluate the performance of our proposed method in both controlled toy scenarios and real-world settings, where we demonstrate the benefit of DORA in safety-critical applications.
    Comment: 21 pages, 22 figures
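    The core idea of flagging anomalous representations can be sketched without the network itself. In this hypothetical toy, each neuron is summarized by a signature vector (random stand-ins here; DORA derives its signatures from the model in a data-agnostic way), and a neuron whose signature is far from all others is flagged as a candidate outlier via a k-nearest-neighbor distance score.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    # Hypothetical signatures: one 32-dim summary vector per neuron; neuron 7
    # is planted as an anomalous ("infected") representation.
    signatures = rng.normal(size=(50, 32))
    signatures[7] += 6.0

    def knn_outlier_scores(X, k=5):
        """Score each row by its mean distance to the k nearest other rows."""
        d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)   # ignore self-distances
        return np.sort(d, axis=1)[:, :k].mean(axis=1)

    scores = knn_outlier_scores(signatures)
    flagged = int(np.argmax(scores))  # index of the most outlying neuron
    ```

    Once such a representation is flagged, the paper's follow-up step is to use it to locate infected samples in any given dataset, since those samples activate the contaminated representation most strongly.
    
    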